Abstract — In many channel measurement applications, one 
needs to estimate some characteristics of the channels based on 
a limited set of measurements. This is mainly due to the highly 
time varying characteristics of the channel. In this contribution, 
it will be shown how free probability can be used for channel 
capacity estimation in MIMO systems. Free probability has 
already been applied in various application fields such as digital 
communications, nuclear physics and mathematical finance, and 
has been shown to be an invaluable tool for describing the 
asymptotic behaviour of many large-dimensional systems. In 
particular, using the concept of free deconvolution, we provide 
an asymptotically (w.r.t. the number of observations) unbiased 
capacity estimator for MIMO channels impaired with noise called 
the free probability based estimator. Another estimator, called 
the Gaussian matrix mean based estimator, is also introduced 
by slightly modifying the free probability based estimator. This 
estimator is shown to give unbiased estimation of the moments 
of the channel matrix for any number of observations. Also, the 
estimator has this property when we extend to MIMO channels 
with phase off-set and frequency drift, for which no estimator 
has been provided so far in the literature. It is also shown that 
both the free probability based and the Gaussian matrix mean 
based estimator are asymptotically unbiased capacity estimators 
as the number of transmit antennas goes to infinity, regardless 
of whether phase off-set and frequency drift are present. The 
limitations in the two estimators are also explained. Simulations 
are run to assess the performance of the estimators for a low 
number of antennas and samples to confirm the usefulness of the 
asymptotic results. 

Index Terms — Free Probability Theory, Random Matrices, 
deconvolution, limiting eigenvalue distribution, MIMO. 



I. Introduction 

Random matrices, and in particular limit distributions of 
sample covariance matrices, have proved to be a useful tool 
for modelling systems, for instance in digital communications 
[1], nuclear physics [2] and mathematical finance [3]. A typical 
random matrix model is the information-plus-noise model. 



$$W_n = \frac{1}{N}(R_n + \sigma X_n)(R_n + \sigma X_n)^H. \tag{1}$$

$R_n$ and $X_n$ are assumed independent random matrices of 
dimension $n \times N$, where $X_n$ contains i.i.d. standard (i.e. mean 
0, variance 1) complex Gaussian entries. (1) can be thought of 
as the sample covariance matrix of random vectors $r_n + \sigma x_n$. 
$r_n$ can be interpreted as a vector containing the system 
characteristics (direction of arrival for instance in radar applications 
or impulse response in channel estimation applications). $x_n$ 
represents additive noise, with $\sigma$ a measure of the strength 
of the noise. Classical signal processing estimation methods 
consider the case where the number of observations $N$ is 
much larger than the dimension of the system $n$, for which 
equation (1) can be shown to be approximately: 



$$W_n \approx \Gamma_n + \sigma^2 I_n. \tag{2}$$



Here, $\Gamma_n$ is the true covariance of the signal. In this case, 
one can separate the signal eigenvalues from the noise ones 
and infer (based only on the statistics of the signal) the 
characteristics of the input signal. However, in many situations, 
one can gather only a limited number of observations during 
which the characteristics of the signal do not change. In 
order to model this case, $n$ and $N$ will be increased so that 
$\lim_{n \to \infty} \frac{n}{N} = c$, i.e. the number of observations is increased 
at the same rate as the number of parameters of the system 
(note that equation (2) corresponds to the case $c = 0$). 

Previous contributions have already dealt with this problem. 
In [4], Dozier and Silverstein explain how one can use the 
eigenvalue distribution of $\Gamma_n = \frac{1}{N} R_n R_n^H$ to estimate the 
eigenvalue distribution of $W_n$ by solving a given equation. 
In [5], [6], we provided an algorithm for passing between 
the two, using the concept of multiplicative free convolution, 
which admits a convenient implementation. The implementation 
performs free convolution exactly, based solely on moments. 

In this paper, channel capacity estimation in MIMO systems 
is used as a benchmark application by using the connection 
between free probability theory and systems of type (1). 
For MIMO channels with and without frequency off-sets, 
we derive explicit asymptotically unbiased estimators which 
perform much better than classical ones. We do not prove 
directly that the proposed estimators work better than the 
classical ones, but present simulations which indicate that they 
are superior. We remark that the proposed capacity estimators 
are not unbiased: either the number of transmit antennas or 
the number of observations must be large to obtain precise 
estimation. This limitation is most severe for channels with 
frequency off-sets, where the number of transmit antennas 
must in any case be large to obtain precise estimation. A case 
study where channel estimation using free deconvolution has 
been used can be found in [7] and [8]. 

This paper is organized as follows. Section II presents the 
problem under consideration. Section III provides the basic 
concepts needed on free probability, including free convolution. 
In section IV we formalize a new channel capacity 
estimator based on free probability, and explain some of 
its shortcomings for MIMO models with frequency off-sets. 
Another estimator, called the Gaussian matrix mean based 
estimator, is then formalized to address the shortcomings of the 
free probability based estimator. We also present arguments for 
the Gaussian matrix mean based estimator performing better 
than the free probability based estimator in some specific 
cases. These arguments are, however, not conclusive; we do 
not prove that one estimator is better than the other for 
the cases considered. The limitations of the estimators are 
also explained. The low rank of the channel (less than or 
equal to four) is the most notable limitation. In section V 
simulations of the estimators are performed and compared, 
where several quantities are varied, like the noise variance, 
rank and dimensions of the channel matrix, and the number of 
observations. In the following, upper (lower) boldface symbols 
will be used for matrices (column vectors) whereas lower case 
symbols will represent scalar values, $(\cdot)^T$ will denote the transpose 
operator, $(\cdot)^*$ conjugation and $(\cdot)^H = ((\cdot)^T)^*$ the hermitian 
transpose. $I_n$ will represent the $n \times n$ identity matrix. $\mathrm{Tr}_n$ 
will denote the non-normalized trace on $n \times n$ matrices, while 
$\mathrm{tr}_n = \frac{1}{n}\mathrm{Tr}_n$ denotes the normalized trace. Also, we will 
throughout the paper use $c$ as a shorthand notation for the 
ratio between the number of rows and the number of columns 
in the random matrix model being considered. 

II. Statement of the problem 

In usual time varying measurement methods for MIMO 
systems, one validates models [9] by determining how the 
model fits with actual capacity measurements. In this setting, 
one has to be extremely cautious about the measurement noise, 
especially for far field measurements where the signal strength 
can be lower than the noise. 

The MIMO measured channel in the frequency domain can 
be modelled by [10], [11] 

$$\hat{H}_i = D_i^r H D_i^t + \sigma X_i, \tag{3}$$

where $\hat{H}_i$, $H$ and $X_i$ are respectively the $n \times m$ measured 
MIMO matrix ($n$ is the number of receiving antennas, $m$ is the 
number of transmitting antennas), the $n \times m$ MIMO channel 
and the $n \times m$ noise matrix with i.i.d. standard Gaussian 
entries. Note that we suppose the noise matrix $X_i$ to be 
spatially white. In the realm of the channel measurements 
under study, the antenna outputs are connected to different 
RF (Radio Frequency) chains. As a consequence, for the case 
under study, the channel noise impairments are independent 
from one receive antenna to the other. When one RF chain 
is used, the noise to be considered is not white. This case can 
also be studied within the framework of free deconvolution 
but goes beyond the scope of the paper. We suppose that the 
channel $H$, although time varying, stays constant (block fading 
assumption) during $L$ blocks. $D_i^r$ and $D_i^t$ are $n \times n$ and $m \times m$ 
diagonal matrices which represent phase off-sets and phase 
drifts (which are impairments due to the antennas and not the 
channel) at the receiver and transmitter (these are supposed to 
vary on a block basis), given respectively by 

$$D_i^r = \mathrm{diag}[e^{j\phi_1^i}, \ldots, e^{j\phi_n^i}] \quad \text{and} \quad D_i^t = \mathrm{diag}[e^{j\theta_1^i}, \ldots, e^{j\theta_m^i}],$$

where the phases $\phi_j^i$ and $\theta_j^i$ are random. We assume all phases 
independent and uniformly distributed. 

We will also compare (3) with the simpler model 

$$\hat{H}_i = H + \sigma X_i, \tag{4}$$

which is (3) without phase off-sets and phase drifts. 

The capacity per receiving antenna (in the case where the 
noise is spatially white additive Gaussian and the channel is 
not known at the transmitter) of a channel with channel matrix 
$H$ and signal to noise ratio $\rho = \frac{1}{\sigma^2}$ is given by 

$$C = \frac{1}{n} \log_2 \det\left(I_n + \frac{1}{m\sigma^2} HH^H\right) = \frac{1}{n} \sum_{l=1}^{n} \log_2\left(1 + \frac{1}{\sigma^2}\lambda_l\right), \tag{5}$$

where $\lambda_l$ are the eigenvalues of $\frac{1}{m}HH^H$. The problem consists 
therefore of estimating the eigenvalues of $\frac{1}{m}HH^H$ based 
on few observations $\hat{H}_i$, which is paramount for modelling 
purposes. Note that the capacity expression supposes that the 
channel is perfectly known at the receiver and not at the 
transmitter. In practice, with the noise impairment, the channel 
will never be estimated perfectly and therefore expression (5) 
is not achievable. However, for MIMO modelling purposes, for 
which the capacity is often the matching metric, one needs to 
compare the capacity of the model with expression (5). 

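There are different methods currently in use for channel capacity 
estimation [12], [13], [14], [15]. Usual methods discard, 
through an ad-hoc threshold procedure, all channels $\hat{H}_i$ for 
which the channel to noise ratio ($\frac{1}{\sigma^2}\mathrm{tr}_n(HH^H)$) is lower than 
a threshold and then compute 

$$\hat{C}(\sigma^2) = \frac{1}{n} \log_2 \det\left(I_n + \frac{1}{\sigma^2 m}\left(\frac{1}{M}\sum_{i=1}^{M} \hat{H}_i\right)\left(\frac{1}{M}\sum_{i=1}^{M} \hat{H}_i\right)^H\right),$$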

where M < L is the number of channels having a signal to 
noise ratio higher than the threshold. One of the drawbacks of 
this method is that one will not analyze the true capacity but 
only the capacity of the "good channels". Moreover, one has 
to limit the channel measurement campaign (in order to have 
enough channels higher than the threshold) only to regions 
which are close (in terms of actual distance) enough to the 
base station. 

Other methods, in order to have a capacity estimation at a 
given signal to noise ratio (different from the measured one 
with noise variance $\sigma^2$), normalize each channel realization $\hat{H}_i$ 
and then compute for a different value of the noise variance 
$\sigma_1^2$ (for example 10 dB) the capacity estimate $\hat{C}(\sigma_1^2)$. In the 
case where $\sigma^2$ is high and $\sigma_1^2$ is low, one usually finds a high 
capacity estimate as one measures only the noise, which is 
known to have a high multiplexing gain. 

In this contribution, we will provide a neat framework, 
based on free deconvolution, for channel capacity estimation 
that circumvents all the previous drawbacks. Moreover, we 
will deal with model (3), for which no solution has been 
provided in the literature so far. 

III. Framework for free convolution 

Free probability theory [16] has grown into an entire field 
of research through the pioneering work of Voiculescu in the 
1980s. Free probability introduces an analogy to the concept 
of independence from classical probability, which can be used 
for non-commutative random variables like matrices. These 
more general random variables are elements in what is called 
a noncommutative probability space. This can be defined by 
a pair $(A, \phi)$, where $A$ is a unital $*$-algebra with unit $I$, and 
$\phi$ is a normalized (i.e. $\phi(I) = 1$) linear functional on $A$. The 
elements of $A$ are called random variables. In all our examples, 
$A$ will consist of $n \times n$ matrices or random matrices. For 
matrices, $\phi$ will be $\mathrm{tr}_n$. The unit in these $*$-algebras is the 
$n \times n$ identity matrix $I_n$. The analogy to independence is 
called freeness: 

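Definition 1: A family of unital $*$-subalgebras $(A_i)_{i \in I}$ will 
be called a free family if 

$$\left\{\begin{array}{l} a_j \in A_{i_j} \\ i_1 \neq i_2, \; i_2 \neq i_3, \; \ldots, \; i_{n-1} \neq i_n \\ \phi(a_1) = \phi(a_2) = \cdots = \phi(a_n) = 0 \end{array}\right\} \Rightarrow \phi(a_1 \cdots a_n) = 0. \tag{6}$$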

A family of random variables is said to be free if the 
algebras they generate form a free family. 

When restricting $A$ to spaces such as matrices, or functions 
with bounded support, it is clear that the moments of $a$ 
uniquely identify a probability measure, here called $\nu_a$, such 
that $\phi(a^k) = \int x^k \, d\nu_a(x)$. In such spaces, the distributions 
of $a_1 + a_2$ and $a_1 a_2$ give us two new probability measures, 
which depend only on the probability measures associated 
with $a_1$, $a_2$ when these are free. Therefore we can define 
two operations on the set of probability measures: additive 
free convolution $\eta_1 \boxplus \eta_2$ for the sum of free random variables, 
and multiplicative free convolution $\eta_1 \boxtimes \eta_2$ for the product of 
free random variables. These operations can in many cases 
be used to predict the spectrum of sums or products of large 
random matrices: if $a_{1n}$ has an eigenvalue distribution which 
approaches $\eta_1$ and $a_{2n}$ has an eigenvalue distribution which 
approaches $\eta_2$, then in many cases the eigenvalue distribution 
of $a_{1n} + a_{2n}$ approaches $\eta_1 \boxplus \eta_2$. 

One important probability measure is the Marchenko Pastur 
law $\mu_c$ [17], which has the density 

$$f^{\mu_c}(x) = \left(1 - \frac{1}{c}\right)^+ \delta_0(x) + \frac{\sqrt{(x-a)^+(b-x)^+}}{2\pi c x}, \tag{7}$$

where $(z)^+ = \max(0, z)$, $a = (1 - \sqrt{c})^2$, $b = (1 + \sqrt{c})^2$, and 
$\delta_0(x)$ is dirac measure (point mass) at 0. According to the 
notation in [18], $\mu_c$ is also the free Poisson distribution with 
rate $\frac{1}{c}$ and jump size $c$. We will need the following formulas 
for the first moments of the Marchenko Pastur law: 

$$\begin{array}{rcl}
\int x f^{\mu_c}(x)\,dx &=& 1 \\
\int x^2 f^{\mu_c}(x)\,dx &=& c + 1 \\
\int x^3 f^{\mu_c}(x)\,dx &=& c^2 + 3c + 1 \\
\int x^4 f^{\mu_c}(x)\,dx &=& c^3 + 6c^2 + 6c + 1.
\end{array} \tag{8}$$



(8) follows immediately from applying what is called the 
moment-cumulant formula [18] to the free cumulants [18] 
of the Marchenko Pastur law $\mu_c$. The (free) cumulants of 
the Marchenko Pastur law are $1, c, c^2, c^3, \ldots$ [5]. Cumulants 
and the moment-cumulant formula in free probability have 
analogous concepts in classical probability. 
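
As a small illustration of the moment-cumulant formula, the following Python sketch recovers the four moments in (8) from the free cumulants $1, c, c^2, c^3$. It assumes the standard recursion over non-crossing partitions, $m_k = \sum_{s=1}^{k} \kappa_s \sum_{i_1 + \cdots + i_s = k - s} m_{i_1} \cdots m_{i_s}$ with $m_0 = 1$. The function names are our own choices for illustration; they are not taken from [18] or from the implementation [20].

```python
from fractions import Fraction


def _power_coefficient(m, s, t):
    """Coefficient of z**t in (m[0] + m[1]*z + ...)**s, truncated at degree t."""
    poly = [Fraction(1)] + [Fraction(0)] * t
    for _ in range(s):
        nxt = [Fraction(0)] * (t + 1)
        for i, a in enumerate(poly):
            for j in range(t + 1 - i):
                nxt[i + j] += a * m[j]
        poly = nxt
    return poly[t]


def moments_from_free_cumulants(kappa):
    """Return [m_1, ..., m_K] for free cumulants kappa = [kappa_1, ..., kappa_K]."""
    m = [Fraction(1)]  # m_0 = 1
    for k in range(1, len(kappa) + 1):
        # Only m_0..m_{k-1} are needed to compute m_k.
        m.append(sum(kappa[s - 1] * _power_coefficient(m, s, k - s)
                     for s in range(1, k + 1)))
    return m[1:]


def marchenko_pastur_moments(c, order=4):
    """Moments of mu_c computed from its free cumulants 1, c, c^2, ..."""
    c = Fraction(c)
    return moments_from_free_cumulants([c ** (s - 1) for s in range(1, order + 1)])
```

Exact rational arithmetic is used so that the output can be compared with the polynomials in (8) without rounding concerns.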

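$\mu_c$ describes asymptotic eigenvalue distributions of Wishart 
matrices, i.e. matrices of the form $\frac{1}{N}RR^H$, with $R$ an $n \times N$ 
random matrix with independent standard complex Gaussian 
entries, and $\frac{n}{N} \to c$. This can be seen from the following 
result, where the difference from (8) vanishes when $N \to \infty$: 

Proposition 1: Let $X_n$ be a complex standard Gaussian $n \times N$ 
matrix, and set $c = \frac{n}{N}$. Then 

$$\begin{array}{rcl}
E\left[\mathrm{tr}_n\left(\frac{1}{N}X_nX_n^H\right)\right] &=& 1 \\
E\left[\mathrm{tr}_n\left(\left(\frac{1}{N}X_nX_n^H\right)^2\right)\right] &=& c + 1 \\
E\left[\mathrm{tr}_n\left(\left(\frac{1}{N}X_nX_n^H\right)^3\right)\right] &=& c^2 + 3c + 1 + \frac{1}{N^2} \\
E\left[\mathrm{tr}_n\left(\left(\frac{1}{N}X_nX_n^H\right)^4\right)\right] &=& c^3 + 6c^2 + 6c + 1 + \frac{5(1+c)}{N^2}.
\end{array} \tag{9}$$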

This will be useful later on when we compute mixed moments 
of Gaussian and deterministic matrices. The proof of proposition 1 
is given in appendix B. 
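
The finite-dimensional moments of a complex Wishart matrix can be checked numerically. The following is a minimal Monte Carlo sketch in plain Python, written for very small matrices only; the function names, the default seed and the trial count are illustrative assumptions and are not part of the simulations reported in this paper.

```python
import random


def complex_gaussian_matrix(n, N, rng):
    """n x N matrix of i.i.d. standard complex Gaussian entries (mean 0, variance 1)."""
    s = 0.5 ** 0.5  # real and imaginary parts each carry variance 1/2
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(N)]
            for _ in range(n)]


def wishart(X):
    """Return (1/N) X X^H as a list of rows."""
    n, N = len(X), len(X[0])
    return [[sum(X[i][k] * X[j][k].conjugate() for k in range(N)) / N
             for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mean_normalized_trace_moments(n, N, order, trials, seed=0):
    """Monte Carlo estimate of E[tr_n(((1/N) X X^H)^k)] for k = 1..order."""
    rng = random.Random(seed)
    totals = [0.0] * order
    for _ in range(trials):
        W = wishart(complex_gaussian_matrix(n, N, rng))
        P = W
        for k in range(order):
            totals[k] += sum(P[i][i].real for i in range(n)) / n
            P = matmul(P, W)
    return [t / trials for t in totals]
```

For instance, with $n = 2$ and $N = 4$ (so $c = 1/2$), the estimates can be compared with $1$, $c + 1$ and $c^2 + 3c + 1 + 1/N^2$; agreement is only up to Monte Carlo error.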

We will also find it useful to introduce the concept of 
multiplicative free deconvolution: Given probability measures 
$\eta$ and $\eta_2$, when there is a unique probability measure $\eta_1$ such 
that $\eta = \eta_1 \boxtimes \eta_2$, we will write $\eta_1 = \eta \boxslash \eta_2$, and say that $\eta_1$ 
is the multiplicative free deconvolution of $\eta$ with $\eta_2$. There 
is no reason why a probability measure should have a unique 
deconvolution, and whether one exists at all depends highly 
on the probability measure $\eta_2$ which we deconvolve with. This 
will not be a problem for our purposes: we will 
only have need for multiplicative free deconvolution with $\mu_c$, 
and only in order to find the moments of the channel matrix. 
The problem of a unique deconvolution is therefore addressed 
by an existing algorithm for free deconvolution [6], which 
finds unique moments of $\eta \boxslash \mu_c$ (as long as the first moment 
of $\eta$ is nonzero). 

We will need the following definitions: 

Definition 2: By the empirical eigenvalue distribution of an 
$n \times n$ random matrix $X$ we mean the random atomic measure 

$$\frac{1}{n}\left(\delta_{\lambda_1(X)} + \cdots + \delta_{\lambda_n(X)}\right),$$

where $\lambda_1(X), \ldots, \lambda_n(X)$ are the (random) eigenvalues of $X$. 

Definition 3: A sequence of random variables $a_{n1}, a_{n2}, \ldots$ 
in probability spaces $(A_n, \phi_n)$ is said to converge in distribution 
if, for any $m_1, \ldots, m_r \in \mathbb{N}$, $k_1, \ldots, k_r \in \{1, 2, \ldots\}$, we 
have that the limit $\phi_n(a_{nk_1}^{m_1} \cdots a_{nk_r}^{m_r})$ exists as $n \to \infty$. 

To make the connection between models (4), (3) and model 
(1), we need the following result [5]: 

Theorem 1: Assume that the empirical eigenvalue distribution 
of $\Gamma_n = \frac{1}{N}R_nR_n^H$ converges in distribution almost 
surely to a compactly supported probability measure $\eta_\Gamma$. Then 
we have that the empirical eigenvalue distribution of $W_n$ 
also converges in distribution almost surely to a compactly 
supported probability measure $\eta_W$ uniquely identified by 

$$\eta_W \boxslash \mu_c = (\eta_\Gamma \boxslash \mu_c) \boxplus \delta_{\sigma^2}, \tag{10}$$

where $\delta_{\sigma^2}$ is dirac measure (point mass) at $\sigma^2$. 

Theorem 1 can also be re-stated (through deconvolution) as 

$$\eta_\Gamma = \left((\eta_W \boxslash \mu_c) \boxminus \delta_{\sigma^2}\right) \boxtimes \mu_c,$$

where $\boxminus$ denotes additive free deconvolution. 

When we have $L$ observations $\hat{H}_i$ in a MIMO system as in 
(4) or (3), we will form the $n \times mL$ random matrices 

$$\hat{H}_{1\ldots L} = H_{1\ldots L} + \frac{\sigma}{\sqrt{L}} X_{1\ldots L} \tag{11}$$

with 

$$\begin{array}{rcl}
\hat{H}_{1\ldots L} &=& \frac{1}{\sqrt{L}}\left[\hat{H}_1, \hat{H}_2, \ldots, \hat{H}_L\right], \\
H_{1\ldots L} &=& \frac{1}{\sqrt{L}}\left[D_1^r H D_1^t, D_2^r H D_2^t, \ldots, D_L^r H D_L^t\right], \\
X_{1\ldots L} &=& \left[X_1, X_2, \ldots, X_L\right].
\end{array}$$

This is the way we will stack the observations in this paper. It 
is only one of many possible stackings. A stacking where the 
ratio between the number of rows and the number of columns 
converges to a quantity between 0 and 1 would allow us to use 
theorem 1 (which implicitly assumes $0 < c < 1$) directly to 
conclude almost sure convergence, which again would help 
us to conclude that the introduced capacity estimators are 
asymptotically unbiased. Such a stacking can also reduce 
the variance of the estimators. Even though the stacking 
considered here may not give the lowest variance, and may 
not give almost sure convergence, we show that its variance 
converges to 0 and that it provides asymptotic unbiasedness for the 
corresponding capacity estimator. 

For the case $L = 1$, the formula 

$$\mathrm{tr}_n\left(\left(\frac{1}{m} D_1^r H D_1^t \left(D_1^r H D_1^t\right)^H\right)^j\right) = \mathrm{tr}_n\left(\left(\frac{1}{m} HH^H\right)^j\right) \tag{12}$$

can be combined with theorem 1 to give the approximation 

$$\nu_{\frac{1}{m}HH^H} \approx \left(\left(\nu_{\frac{1}{m}\hat{H}_1\hat{H}_1^H} \boxslash \mu_{\frac{n}{m}}\right) \boxminus \delta_{\sigma^2}\right) \boxtimes \mu_{\frac{n}{m}} \tag{13}$$

for a single observation. This approximation works well when 
$n$ is large. For many observations, note that $H_{1\ldots L}H_{1\ldots L}^H = HH^H$ 
when there is no phase off-set and phase drift, so that 
the approximation 

$$\nu_{\frac{1}{m}HH^H} \approx \left(\left(\nu_{\frac{1}{m}\hat{H}_{1\ldots L}\hat{H}_{1\ldots L}^H} \boxslash \mu_{\frac{n}{mL}}\right) \boxminus \delta_{\sigma^2}\right) \boxtimes \mu_{\frac{n}{mL}} \tag{14}$$

applies and generalizes (13). The ratio between the number of 
rows and columns in the matrices $\hat{H}_{1\ldots L}$, $X_{1\ldots L}$ and $H_{1\ldots L}$ is 
$c = \frac{n}{mL}$, considering the horizontal stacking of the observations 
in a larger matrix. It is only this stacking which will be 
considered in this paper. 

When phase off-set and phase drift are added, it is much 
harder to adapt theorem 1 to produce the moments of $\frac{1}{m}HH^H$. 
The reason is that theorem 1 really helps us to find the 
moments of $\frac{1}{m}H_{1\ldots L}H_{1\ldots L}^H$. In the case without phase off-set 
and phase drift, this is enough since these moments are 
equal to the moments of $\frac{1}{m}HH^H$. However, equality between 
these moments does not hold when phase off-set and phase 
drift are added. A procedure for converting between these 
moments may exist, but seems to be rather complex, and 
will not be dealt with here. In section IV we will instead 
define an estimator for the channel capacity which does not 
stack observations into the matrix $\hat{H}_{1\ldots L}$ at all. Instead, an 
estimation will be performed for each observation, taking the 
mean of all the estimates at the end. 


IV. New estimators for channel capacity 

In this section, two new channel capacity estimators are 
defined. First, a free probability based estimator is introduced, 
which (for model (4)) will be shown to be asymptotically 
unbiased w.r.t. the number of observations. Then, by slightly 
modifying the free probability based estimator, we will construct 
what we call the Gaussian matrix mean based capacity 
estimator. This estimator will be shown, for models (4) and 
(3), to give unbiased estimates of the moments of the channel 
matrix for any number of observations. The computational 
complexity for the two estimators lies in the computation of 
eigenvalues and moments of the matrix $\hat{H}_{1\ldots L}\hat{H}_{1\ldots L}^H$, in addition 
to computing the free (de)convolution in terms of moments. 
For the matrix ranks considered here, free (de)convolution 
requires few computations. The complexity in the computation 
of eigenvalues and moments of the matrix $\hat{H}_{1\ldots L}\hat{H}_{1\ldots L}^H$ grows with 
$n$ (the number of receiving antennas), which is small in this 
paper. The computational complexity in the estimators grows 
slowly with the number of observations, since the dimensions 
of $\hat{H}_{1\ldots L}\hat{H}_{1\ldots L}^H$ do not grow with $L$. 

The two estimators are stated as estimators for the lower 
order moments of $\frac{1}{m}HH^H$. Under the assumption that this 
matrix has limited rank (such as $\le 4$ here), estimators for 
lower order moments can be used to define estimators for 
the channel capacity, since the capacity can be written as a 
function of the $r$ lowest moments when the matrix has rank 
$r$, as explained below. 

A. The free probability based capacity estimator 

The free probability based estimator is defined as follows: 

Definition 4: The free probability based estimator for the 
capacity of a channel with channel matrix $H$ of rank $r$, denoted 
$C_f$, is computed through the following steps: 

1) Compute the first $r$ moments $\hat{h}_1, \ldots, \hat{h}_r$ of the sample 
covariance matrix $\frac{1}{m}\hat{H}_{1\ldots L}\hat{H}_{1\ldots L}^H$ (i.e. compute 
$\hat{h}_j = \mathrm{tr}_n\left(\left(\frac{1}{m}\hat{H}_{1\ldots L}\hat{H}_{1\ldots L}^H\right)^j\right)$ for $1 \le j \le r$), 

2) use (14) to estimate the first $r$ moments $h_{f1}, \ldots, h_{fr}$ of 
$\frac{1}{m}HH^H$, 

3) estimate the $r$ nonzero eigenvalues $\lambda_1, \ldots, \lambda_r$ of $\frac{1}{m}HH^H$ 
from $h_{f1}, \ldots, h_{fr}$. Substitute these in (5). 

We also call $h_{f1}, \ldots, h_{fr}$ the free probability based estimators 
for the $r$ first moments of $\frac{1}{m}HH^H$. 

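Steps 2 and 3 in definition 4 need some elaboration. To 
address step 3, consider the case of a rank 3 channel matrix. 
For such channel matrices, only the lowest three moments $h_1$, 
$h_2$, $h_3$ of $\frac{1}{m}HH^H$ need to be estimated in order to estimate 
the eigenvalues. To see this, first write 

$$C = \frac{1}{n}\log_2\det\left(I_n + \frac{1}{m\sigma^2}HH^H\right) = \frac{1}{n}\log_2\left(\left(1 + \frac{1}{\sigma^2}\lambda_1\right)\left(1 + \frac{1}{\sigma^2}\lambda_2\right)\left(1 + \frac{1}{\sigma^2}\lambda_3\right)\right), \tag{15}$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the three non-zero eigenvalues of 
$\frac{1}{m}HH^H$. This quantity can easily be calculated from the 
elementary symmetric polynomials 

$$\begin{array}{rcl}
\Pi_1(\lambda_1, \lambda_2, \lambda_3) &=& \lambda_1 + \lambda_2 + \lambda_3 \\
\Pi_2(\lambda_1, \lambda_2, \lambda_3) &=& \lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_1\lambda_3 \\
\Pi_3(\lambda_1, \lambda_2, \lambda_3) &=& \lambda_1\lambda_2\lambda_3
\end{array}$$

by observing that 

$$\left(1 + \frac{1}{\sigma^2}\lambda_1\right)\left(1 + \frac{1}{\sigma^2}\lambda_2\right)\left(1 + \frac{1}{\sigma^2}\lambda_3\right)$$

can be written as 

$$1 + \frac{1}{\sigma^2}\Pi_1(\lambda_1, \lambda_2, \lambda_3) + \frac{1}{\sigma^4}\Pi_2(\lambda_1, \lambda_2, \lambda_3) + \frac{1}{\sigma^6}\Pi_3(\lambda_1, \lambda_2, \lambda_3). \tag{16}$$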

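$\Pi_1$, $\Pi_2$ and $\Pi_3$ can in turn be calculated from the power 
polynomials 

$$\begin{array}{rcl}
S_1(\lambda_1, \lambda_2, \lambda_3) &=& \lambda_1 + \lambda_2 + \lambda_3 = n\,\mathrm{tr}_n\left(\frac{1}{m}HH^H\right) \\
S_2(\lambda_1, \lambda_2, \lambda_3) &=& \lambda_1^2 + \lambda_2^2 + \lambda_3^2 = n\,\mathrm{tr}_n\left(\left(\frac{1}{m}HH^H\right)^2\right) \\
S_3(\lambda_1, \lambda_2, \lambda_3) &=& \lambda_1^3 + \lambda_2^3 + \lambda_3^3 = n\,\mathrm{tr}_n\left(\left(\frac{1}{m}HH^H\right)^3\right)
\end{array}$$

by using the Newton-Girard formulas [19], which for the three 
first moments take the form $\Pi_1 = S_1$, $\Pi_2 = \frac{1}{2}\left(S_1^2 - S_2\right)$ 
and $\Pi_3 = \frac{1}{6}\left(S_1^3 - 3S_1S_2 + 2S_3\right)$. If the channel matrix has a 
higher rank $r$, similar reasoning can be used to conclude that 
the first $r$ moments need to be estimated. In the simulations, 
the eigenvalues themselves are never computed, since computation 
of the moments and the Newton-Girard formulas make 
this unnecessary. 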
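
The passage from moments to capacity can be sketched in a few lines of Python. The sketch below assumes the general Newton-Girard recursion $k\,\Pi_k = \sum_{i=1}^{k} (-1)^{i-1}\Pi_{k-i}S_i$ (with $\Pi_0 = 1$), which reduces to the three formulas above for $k \le 3$. The function names are illustrative and do not refer to the Matlab code used for the simulations.

```python
import math


def elementary_from_power_sums(p):
    """Newton-Girard: return [Pi_0, Pi_1, ..., Pi_r] from power sums p = [S_1, ..., S_r]."""
    e = [1.0]
    for k in range(1, len(p) + 1):
        acc = 0.0
        for i in range(1, k + 1):
            acc += (-1) ** (i - 1) * e[k - i] * p[i - 1]
        e.append(acc / k)
    return e


def capacity_from_moments(moments, n, sigma2):
    """Capacity (5) of a rank-r channel from the first r moments tr_n(((1/m) H H^H)^j).

    moments[j-1] is the j-th normalized-trace moment, so S_j = n * moments[j-1].
    """
    power_sums = [n * h for h in moments]
    e = elementary_from_power_sums(power_sums)
    # Product of (1 + lambda_l / sigma^2) over the nonzero eigenvalues, as in (16).
    product = sum(e[k] / sigma2 ** k for k in range(len(e)))
    return math.log2(product) / n
```

The eigenvalues never appear explicitly: only the moments enter, which is why moment estimators suffice for capacity estimation when the rank is known to be small.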

To address step 2 in definition 4, a Matlab implementation 
[20] which performs free (de)convolution in terms of 
moments as described in [6] was developed and used for the 
simulations in this paper. Free (de)convolution is computationally 
expensive for higher order moments only: for the first four 
moments, step 2 in definition 4 is equivalent to the following: 

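Proposition 2: Let $\hat{h}_1, \hat{h}_2, \hat{h}_3, \hat{h}_4$ and $h_{f1}, h_{f2}, h_{f3}, h_{f4}$ be 
as in definition 4. Then 

$$\begin{array}{rcl}
\hat{h}_1 &=& h_{f1} + \sigma^2 \\
\hat{h}_2 &=& h_{f2} + 2\sigma^2(1 + c)h_{f1} + \sigma^4(1 + c) \\
\hat{h}_3 &=& h_{f3} + 3\sigma^2(1 + c)h_{f2} + 3\sigma^2 c\,h_{f1}^2 \\
&& +\, 3\sigma^4\left(c^2 + 3c + 1\right)h_{f1} \\
&& +\, \sigma^6\left(c^2 + 3c + 1\right) \\
\hat{h}_4 &=& h_{f4} + 4\sigma^2(1 + c)h_{f3} + 8\sigma^2 c\,h_{f2}h_{f1} \\
&& +\, \sigma^4\left(6c^2 + 16c + 6\right)h_{f2} \\
&& +\, 14\sigma^4 c(1 + c)h_{f1}^2 \\
&& +\, 4\sigma^6\left(c^3 + 6c^2 + 6c + 1\right)h_{f1} \\
&& +\, \sigma^8\left(c^3 + 6c^2 + 6c + 1\right),
\end{array} \tag{17}$$

where $c = \frac{n}{mL}$. 

The proof of proposition 2 can be found in appendix A. 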
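
Since the relations of proposition 2 are triangular in the unknown moments, they can be solved by back substitution. The following Python sketch is one way to do this; it is not the Matlab implementation [20]. The coefficients are written out as we read them from the free moment-cumulant computation, including a term $14\sigma^4 c(1+c)h_{f1}^2$ in the fourth relation, and the function name is an illustrative assumption.

```python
def free_moment_estimates(h_hat, sigma2, c):
    """Solve the four relations of proposition 2 for [hf1, hf2, hf3, hf4].

    h_hat  : first four moments of the sample covariance matrix
    sigma2 : noise variance sigma^2
    c      : ratio n / (m L) for the stacked observation matrix
    """
    s = sigma2
    mp3 = c**2 + 3*c + 1             # third moment of the Marchenko Pastur law
    mp4 = c**3 + 6*c**2 + 6*c + 1    # fourth moment of the Marchenko Pastur law
    hf1 = h_hat[0] - s
    hf2 = h_hat[1] - 2*s*(1 + c)*hf1 - s**2*(1 + c)
    hf3 = (h_hat[2] - 3*s*(1 + c)*hf2 - 3*s*c*hf1**2
           - 3*s**2*mp3*hf1 - s**3*mp3)
    hf4 = (h_hat[3] - 4*s*(1 + c)*hf3 - 8*s*c*hf2*hf1
           - s**2*(6*c**2 + 16*c + 6)*hf2
           - 14*s**2*c*(1 + c)*hf1**2
           - 4*s**3*mp4*hf1 - s**4*mp4)
    return [hf1, hf2, hf3, hf4]
```

Two sanity checks are easy to do by hand. For $c = 0$ the relations reduce to the binomial expansion of $(\lambda + \sigma^2)^k$, and for a zero channel the observed moments are $\sigma^{2k}$ times the moments (8), so that all estimates vanish.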
The following is the main result on the free probability 
based estimator, and covers the different cases for bias and 
asymptotic bias w.r.t. number of observations or antennas. 

Theorem 2: For $L = 1$ observation, the following holds for 
both models (3) and (4): 

1) $h_{f1}$ and $h_{f2}$ are unbiased, $h_{f3}$ and $h_{f4}$ are biased, with 
the bias of $h_{f3}$ given by 



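In particular $h_{f3}$ and $h_{f4}$ are asymptotically unbiased 
when $m \to \infty$ (with $n$, $L$ kept fixed), i.e. 

$$\lim_{m \to \infty} E(h_{fj}) = \mathrm{tr}_n\left(\left(\frac{1}{m}HH^H\right)^j\right), \quad 3 \le j \le 4.$$

2) $C_f$ is asymptotically unbiased when $m \to \infty$ (with 
$n$, $L$ kept fixed) and $\frac{1}{m}HH^H$ has rank $\le 4$, i.e. 
$\lim_{m \to \infty} C_f = C$. 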
For any number of observations $L$ with model (4), the following 
holds: 

1) $h_{f1}$ and $h_{f2}$ are unbiased, $h_{f3}$ and $h_{f4}$ are biased, with 
the bias of $h_{f3}$ given by 



In particular $h_{f3}$ and $h_{f4}$ are asymptotically unbiased 
when either $m \to \infty$ or $L \to \infty$ (with the other kept 
fixed), i.e. 

$$\lim_{m \to \infty} E(h_{fj}) = \lim_{L \to \infty} E(h_{fj}) = \mathrm{tr}_n\left(\left(\frac{1}{m}HH^H\right)^j\right)$$

for $3 \le j \le 4$. 

2) $C_f$ is asymptotically unbiased when either $m \to \infty$ 
(with $n$, $L$ kept fixed), or $L \to \infty$ (with $m$, $n$ kept 
fixed) and $\frac{1}{m}HH^H$ has rank $\le 4$, i.e. $\lim_{m \to \infty} C_f = 
\lim_{L \to \infty} C_f = C$. 

The proof of theorem 2 can be found in appendix C. The 
bias in theorem 2 motivates the definition of the estimator of 
the next section. The free probability based estimator performs 
estimation as if the Gaussian random matrices and deterministic 
matrices involved were free. It turns out that these matrices 
are only asymptotically free [16], which explains why there 
is a bias involved, and why the bias decreases as the matrix 
dimensions increase. 



B. The Gaussian matrix mean based capacity estimator 

The expression for the Gaussian matrix mean based capacity 
estimator is motivated from computing expected values 
of mixed moments of Gaussian and deterministic matrices 
(lemma 1). This results in expressions slightly different from 
(17). We will show that the Gaussian matrix mean based 
estimator can be used for channel capacity estimation in certain 
systems where the free probability based estimator fails. 
The definition of the Gaussian matrix mean based capacity 
estimator is as follows for matrices of rank $\le 4$: 

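Definition 5: The Gaussian matrix mean based estimator 
for the capacity of a channel with channel matrix $H$ of rank 
$r \le 4$, denoted $C_g$, is defined through the following steps: 

1) For each observation, perform the following 

a) Compute the first $r$ moments $\hat{h}_{i1}, \ldots, \hat{h}_{ir}$ of the 
sample covariance matrix $\frac{1}{m}\hat{H}_i\hat{H}_i^H$ (i.e. compute 
$\hat{h}_{ij} = \mathrm{tr}_n\left(\left(\frac{1}{m}\hat{H}_i\hat{H}_i^H\right)^j\right)$ for $1 \le j \le r$). 

b) find estimates $h_{i1}, h_{i2}, h_{i3}, h_{i4}$ of the first four 
moments of $\frac{1}{m}HH^H$ by solving 

$$\begin{array}{rcl}
\hat{h}_{i1} &=& h_{i1} + \sigma^2 \\
\hat{h}_{i2} &=& h_{i2} + 2\sigma^2(1 + c)h_{i1} + \sigma^4(1 + c) \\
\hat{h}_{i3} &=& h_{i3} + 3\sigma^2(1 + c)h_{i2} + 3\sigma^2 c\,h_{i1}^2 \\
&& +\, 3\sigma^4\left(c^2 + 3c + 1 + \frac{1}{m^2}\right)h_{i1} \\
&& +\, \sigma^6\left(c^2 + 3c + 1 + \frac{1}{m^2}\right) \\
\hat{h}_{i4} &=& h_{i4} + 4\sigma^2(1 + c)h_{i3} + 8\sigma^2 c\,h_{i2}h_{i1} \\
&& +\, \sigma^4\left(6c^2 + 16c + 6 + \frac{16}{m^2}\right)h_{i2} \\
&& +\, 14\sigma^4 c(1 + c)h_{i1}^2 \\
&& +\, 4\sigma^6\left(c^3 + 6c^2 + 6c + 1 + \frac{5(c+1)}{m^2}\right)h_{i1} \\
&& +\, \sigma^8\left(c^3 + 6c^2 + 6c + 1 + \frac{5(c+1)}{m^2}\right),
\end{array} \tag{18}$$

where $c = \frac{n}{m}$. 

Form the estimates $h_{uj} = \frac{1}{L}\sum_{i=1}^{L} h_{ij}$, $1 \le j \le r$, of 
the first moments of $\frac{1}{m}HH^H$, 

2) estimate the $r$ nonzero eigenvalues $\lambda_1, \ldots, \lambda_r$ of $\frac{1}{m}HH^H$ 
from $h_{u1}, \ldots, h_{ur}$. Substitute these in (5). 

We also call $h_{u1}, \ldots, h_{ur}$ the Gaussian matrix mean based 
estimators for the $r$ first moments of $\frac{1}{m}HH^H$. 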

While a Matlab implementation [20] of free (de)convolution 
is used for the free (de)convolution in the free probability 
based estimator, the algorithm for the Gaussian matrix mean 
based capacity estimator used by the simulations in this paper 
follows the steps in definition 5 directly. 
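
The per-observation solve and the final averaging of the Gaussian matrix mean based moment estimators can be sketched as follows. This is a hedged illustration only: the coefficients follow our reading of (18), in particular the $1/m^2$ corrections and the $14\sigma^4 c(1+c)$ term, and the function name and argument layout are assumptions made for this example.

```python
def gaussian_matrix_mean_moments(observed, sigma2, n, m):
    """Average per-observation moment estimates.

    observed : list with one entry [h_hat_i1, ..., h_hat_i4] per observation,
               the moments of (1/m) H_i_hat H_i_hat^H
    Returns [hu1, hu2, hu3, hu4].
    """
    s, c = sigma2, n / m
    k = 1.0 / m**2                                   # finite-size correction scale
    a3 = c**2 + 3*c + 1 + k                          # third moment in (9) with N = m
    a4 = c**3 + 6*c**2 + 6*c + 1 + 5*(c + 1)*k       # fourth moment in (9) with N = m
    sums = [0.0, 0.0, 0.0, 0.0]
    for h in observed:
        h1 = h[0] - s
        h2 = h[1] - 2*s*(1 + c)*h1 - s**2*(1 + c)
        h3 = (h[2] - 3*s*(1 + c)*h2 - 3*s*c*h1**2
              - 3*s**2*a3*h1 - s**3*a3)
        h4 = (h[3] - 4*s*(1 + c)*h3 - 8*s*c*h2*h1
              - s**2*(6*c**2 + 16*c + 6 + 16*k)*h2
              - 14*s**2*c*(1 + c)*h1**2
              - 4*s**3*a4*h1 - s**4*a4)
        for j, v in enumerate((h1, h2, h3, h4)):
            sums[j] += v
    return [t / len(observed) for t in sums]
```

The returned moments can then be turned into a capacity estimate through the Newton-Girard formulas, exactly as for the free probability based estimator.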

Note that (18) resembles the formulas in (17), with additional 
terms of order $1/m^2$. $c = \frac{n}{m}$ is used in definition 5 since the 
observation matrices $\hat{H}_i$ are not stacked together in a larger 
matrix in this case. Instead, a mean is taken of all estimated 
moments in step 1 of the definition. This is not an optimal 
procedure, and we use it only because it is hard to compute 
mixed moments of matrices where observations $\hat{H}_i$ of type (3) 
are stacked together. 

The following theorem is the main result on the Gaussian 
matrix mean based estimator, and shows that it qualifies for 
its name. 

Theorem 3: For either model (4) or (3), the following holds: 

1) The estimators $h_{u1}$, $h_{u2}$, $h_{u3}$, $h_{u4}$ are unbiased, i.e. 

$$E(h_{uj}) = \mathrm{tr}_n\left(\left(\frac{1}{m}HH^H\right)^j\right), \quad 1 \le j \le 4.$$

2) $C_g$ is asymptotically unbiased as $m \to \infty$ (with 
$n$, $L$ kept fixed) when $\frac{1}{m}HH^H$ has rank $\le 4$, i.e. 
$\lim_{m \to \infty} C_g = C$. 

3) In the case of $L = 1$ observation, $h_{f1} = h_{u1}$ and $h_{f2} = 
h_{u2}$. In particular, $C_f = C_g$ when $\frac{1}{m}HH^H$ has rank 
$\le 2$. 

The proof of theorem 3 can be found in appendix C. 

C. Limitations of the two estimators 

We have chosen to define two estimators, since they have 
different limitations. 

The most severe limitation of the Gaussian matrix mean 
based capacity estimator, the way it is defined, lies in the 
restriction on the rank. This restriction is made to limit the 
complexity in the expression for the estimator. However, the 
computations in appendix C should convince the reader that 
capacity estimators with similar properties can be written 
down (however complex) for higher rank channels also. Also, 
while the free probability based estimator has an algorithm [6] 
for channel matrices of any rank, there is no reason why a 
similar algorithm cannot be found for the Gaussian matrix 
mean based estimator also. The computations in appendix C 
indicate that such an algorithm should be based solely on 
iteration through a finite set of partitions. How this can be 
done algorithmically is beyond the scope of this paper. 

For the free probability based estimator the limitation lies 
in the presence of phase off-set and phase drift (model (3)): 
When model (3) is used, the comments at the end of section III 
make it clear that we lack a relation for obtaining the moments 
of $\frac{1}{m}H_{1\ldots L}H_{1\ldots L}^H$ from the moments of $\frac{1}{m}HH^H$. Without 
such a relation, we also have no candidate for a capacity 
estimator (capacity estimators in this paper are motivated by 
first finding moment estimators). In conclusion, the stacking of 
observations performed by the free probability based estimator 
does not work for model (3). Only the Gaussian matrix mean 
based estimator can perform reliable capacity estimation for 
many observations with model (3). The second limitation of 
the free probability based estimator comes from the inherent 
bias in its deconvolution formulas (17). The bias is only large 
when both $m$ and $L$ are small (see theorem 2), so this point 
is less severe (however, channel matrices down to size $4 \times 4$ 
occur in practice). The bias in the lower order moments is 
easily seen to affect capacity estimation from the following 
expansion of the capacity 

$$C = \frac{1}{\ln 2}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}\rho^k m_k}{k}, \tag{19}$$

which can be obtained from substituting the Taylor expansion 

$$\log_2(1 + x) = \frac{1}{\ln 2}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}x^k}{k} \tag{20}$$

into the definition of the capacity. Here $\rho = 1/\sigma^2$ is SNR, and 
$m_k$ are the moments of $\frac{1}{m}HH^H$. It is clear from (19) that, at 
least if we restrict to small $\rho$, the expression is dominated 
by the contribution from the first order moments. If $m$ is 
small we therefore first have a high relative error in the first 
moments after the deconvolution step, which will propagate to 
a high relative error in the capacity estimate for small $\rho$ due 
to (19). Thus, free probability based capacity estimation will 
work poorly for small $m$, $L$ and $\rho$. The same limitation is not 
present in the Gaussian matrix mean based estimator, since its 
moment estimators are unbiased. 
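
The dominance of the low order moments for small $\rho$ can be seen numerically. The following Python sketch compares the truncated series (19) with the exact capacity (5) for a given set of eigenvalues; the function names and the example eigenvalues are illustrative assumptions, and the series is only meaningful when $\rho\lambda_l < 1$ for all eigenvalues.

```python
import math


def capacity_exact(eigs, n, rho):
    """Capacity (5) from the nonzero eigenvalues of (1/m) H H^H, with rho = 1/sigma^2."""
    return sum(math.log2(1 + rho * l) for l in eigs) / n


def moments_of(eigs, n, order):
    """Normalized-trace moments m_k = (1/n) * sum of lambda_l^k, for k = 1..order."""
    return [sum(l ** k for l in eigs) / n for k in range(1, order + 1)]


def capacity_series(moments, rho):
    """Truncation of the expansion (19) using the supplied moments m_1, m_2, ..."""
    total = sum((-1) ** (k + 1) * rho ** k * mk / k
                for k, mk in enumerate(moments, start=1))
    return total / math.log(2)
```

With eigenvalues 0.5, 1.0 and 2.0, $n = 10$ and $\rho = 0.1$, keeping only the first term of the series already lands within roughly ten percent of the exact value, so a relative error in the first moment carries over almost directly to the capacity estimate.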

The limitation on the rank can in some cases be avoided 
if we instead have some bounds on the eigenvalues: if we 
knew that at most four of the eigenvalues are not "negligible", 
we could still use proposition 2 to estimate the capacity. This 
follows from results on the continuity of multiplicative free 
convolution, which has been covered in [21]. Such continuity 
issues are also beyond the scope of this paper. 

V. Channel capacity estimation 

Several candidates for channel capacity estimators for (4) 
have been used in the literature. We will consider the following: 

$$C_1 = \frac{1}{nL} \sum_{i=1}^{L} \log_2 \det \left( I_n + \frac{1}{m\sigma^2} \hat{H}_i \hat{H}_i^H \right)$$

$$C_2 = \frac{1}{n} \log_2 \det \left( I_n + \frac{1}{L\sigma^2 m} \sum_{i=1}^{L} \hat{H}_i \hat{H}_i^H \right)$$

$$C_3 = \frac{1}{n} \log_2 \det \left( I_n + \frac{1}{\sigma^2 m} \left( \frac{1}{L} \sum_{i=1}^{L} \hat{H}_i \right) \left( \frac{1}{L} \sum_{i=1}^{L} \hat{H}_i \right)^H \right) \qquad (21)$$

These will be compared with the free probability based (Cf) 
and the Gaussian matrix mean based (Cg) estimators. 

Fig. 1. Comparison of various classical capacity estimators for various 
number of observations, model (4). $\sigma^2 = 0.1$, n = 10 receive antennas, 
m = 10 transmit antennas. The rank of H was 3. 

Fig. 2. Comparison of Cf and Cg for various number of observations, 
model (4). $\sigma^2 = 0.1$, n = 10 receive antennas, m = 10 transmit antennas. 
The rank of H was 3. 
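
The three classical estimators in (21) can be sketched in a few lines of dependency-free Python. This is only an illustration under our reading of the normalizations in (21); the helper names (`conj_transpose`, `matmul`, `det`, `log2det`) and the small test channel are ours and not part of the paper. Matrices are lists of rows with complex entries.

```python
import math

def conj_transpose(a):
    """Hermitian transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(v) for v in row] for row in a]
    n, d = len(a), 1 + 0j
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[piv][col] == 0:
            return 0j
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            d = -d
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return d

def log2det(gram, scale):
    """log2 det(I_n + scale * gram) for a Hermitian PSD matrix gram."""
    n = len(gram)
    a = [[(1.0 if i == j else 0.0) + scale * gram[i][j] for j in range(n)]
         for i in range(n)]
    return math.log2(det(a).real)

def c1(obs, sigma2):
    """Average of the per-observation capacities."""
    n, m = len(obs[0]), len(obs[0][0])
    return sum(log2det(matmul(h, conj_transpose(h)), 1.0 / (m * sigma2))
               for h in obs) / (n * len(obs))

def c2(obs, sigma2):
    """Capacity of the averaged Gram matrices."""
    n, m, L = len(obs[0]), len(obs[0][0]), len(obs)
    grams = [matmul(h, conj_transpose(h)) for h in obs]
    total = [[sum(g[i][j] for g in grams) for j in range(n)] for i in range(n)]
    return log2det(total, 1.0 / (L * sigma2 * m)) / n

def c3(obs, sigma2):
    """Capacity of the averaged observation matrix."""
    n, m, L = len(obs[0]), len(obs[0][0]), len(obs)
    mean = [[sum(h[i][j] for h in obs) / L for j in range(m)] for i in range(n)]
    return log2det(matmul(mean, conj_transpose(mean)), 1.0 / (sigma2 * m)) / n

# Noise-free sanity check with a hypothetical 2 x 2 channel: all three agree.
H = [[1, 1j], [0, 2]]
obs = [H, H, H]
print(c1(obs, 0.5), c2(obs, 0.5), c3(obs, 0.5))
```

Without noise the three expressions coincide with the true capacity; they only start to differ once each observation carries its own noise term.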

A. Channels without phase off-set and phase drift 

In figure 1, C1, C2 and C3 are compared for various numbers 
of observations, with $\sigma^2 = 0.1$, and a 10 x 10 channel matrix 
of rank 3. It is seen that only the C3 estimator gives values 
close to the true capacity. The channel considered has no phase 
drift or phase off-set. C1 and C2 are seen to have a high bias. 

In figure 2 the same $\sigma$ and channel matrix are put to the 
test with the free-probability based and Gaussian matrix mean 
based estimators for various numbers of observations. These 
give values close to the true capacity. Both work better than 
C3 for a small number of observations. 

The free-probability based estimator converges faster (in 
terms of the number of observations) for lower rank channel 
matrices. In figure 3 we illustrate this for 10 x 10 channel 
matrices of rank 3, 5 and 6. Simulations show that for channel 
matrices of lower dimension (for instance 6 x 6), we have 
slower convergence to the true capacity. 

B. Channels with phase off-set and phase drift 

In figure 4 the C3 estimator is compared with the free- 
probability based estimator, the Gaussian matrix mean based 
estimator and the true capacity, for various number of 
observations, and with the same $\sigma$ and channel matrix as in 
figures 1 and 2. Phase off-set and phase drift have also been 
introduced. In this case, the free-probability based estimator 
and the C3 estimator seem to be biased. 

Fig. 3. Cf for various number of observations, model (4). $\sigma^2 = 0.1$, n = 10 
receive antennas, m = 10 transmit antennas. The rank of H was 3, 5 and 6. 
(Curves: true capacity and Cf for each rank; horizontal axis: number of 
observations.) 

In figure 5 simulations have been performed for various $\sigma$. 
Only L = 1 observation was used, n = 10 receive antennas, 
and m = 10 transmit antennas. The channel matrix has rank 
3. It is seen that the Gaussian matrix mean based capacity 
estimator is very close to the true capacity. There are only 
small deviations even when only one observation is available, 
which makes the estimator a very good candidate for channel 
estimation in highly time-varying environments. The deviations 
are higher for higher $\sigma$. 

In figure 6 we have also varied $\sigma$ and used only one 
observation, but we have formed another rank 3 matrix with 
n = 4 receive antennas, m = 4 transmit antennas. It is seen 
that the deviation from the true capacity is much higher in this 
case. We have in figure 7 increased the number of observations 
to 10, and used the same channel matrix. It is seen that this 
decreases the deviation from the true capacity. 

Fig. 4. Comparison of capacity estimators which worked for model (4) for 
increasing number of observations. Model (5) is used, $\sigma^2 = 0.1$, n = 10 
receive antennas, m = 10 transmit antennas. The rank of H was 3. 

Fig. 5. Cg for L = 1 observation, n = 10 receive antennas, m = 10 
transmit antennas, with varying values of $\sigma$. Model (3). The rank of H was 
3. 

Fig. 6. Cg for L = 1 observation, n = 4 receive antennas, m = 4 transmit 
antennas, with varying values of $\sigma$. Model (5). The rank of H was 3. 

Fig. 7. Cg for L = 10 observations, n = 4 receive antennas, m = 4 
transmit antennas, with varying values of $\sigma$. Model (5). The rank of H was 
3. 

Finally, let us use a channel matrix of rank 4. In this 
case we have to increase the number of observations even 
further to accurately predict the channel capacity. In figure 8 
Gaussian matrix mean based capacity estimation is performed 
for a rank 4 channel matrix with n = 4 receive antennas and 
m = 4 transmit antennas, using one observation. If we 
increase the number of observations, Gaussian matrix mean 
based capacity estimation is seen to go very slowly towards 
the true capacity. To illustrate this, figure 9 shows Gaussian 
matrix mean based capacity estimation for 10 observations on 
the same channel matrix. It is seen that this decreases the 
deviation from the true capacity. 



VI. Conclusion 

In this paper, we have shown that free probability provides 
a neat framework for estimating the channel capacity for 
certain MIMO systems. In the case of highly time-varying 
environments, where one can rely only on a set of limited noisy 
measurements, we have provided an asymptotically unbiased 
estimator of the channel capacity. A modified estimator called 
the Gaussian matrix mean based estimator was also introduced 
to take into account the bias in the case of finite dimensions 
and was proved to be adequate for low rank channel matrices. 
Moreover, although the results are based on asymptotic claims 
(in the number of observations), simulations show that the 
estimators work well for a very low number of observations 
also. Even when considering discrepancies such as phase drifts 
and phase off-set, the algorithm, based on the Gaussian matrix 
mean based estimator, provided very good performance. Further 
research is being conducted to take into account spatial 
correlation of the noise (in other words, deconvolving with 
other measures than the Marchenko Pastur law). 

Fig. 8. Cg for L = 1 observation, n = 4 receive antennas, m = 4 transmit 
antennas, with varying values of $\sigma$. Model (3). The rank of H was 4. 

Fig. 9. Cg for L = 10 observations, n = 4 receive antennas, m = 4 
transmit antennas, with varying values of $\sigma$. Model (3). The rank of H was 
4. 

Appendix A 
The proof of proposition 2 

Let $(m_1, m_2, \ldots)$ be the moments of $\eta$, $(M_1, M_2, \ldots)$ the 
moments of $\eta \boxtimes \mu_c$. Then [6] 

$$\begin{aligned}
cM_1 &= cm_1 \\
cM_2 &= cm_2 + c^2 m_1^2 \\
cM_3 &= cm_3 + 3c^2 m_1 m_2 + c^3 m_1^3 \\
cM_4 &= cm_4 + 4c^2 m_1 m_3 + 2c^2 m_2^2 + 6c^3 m_1^2 m_2 + c^4 m_1^4.
\end{aligned} \qquad (22)$$

Note that (22) can also be inverted to express the $m_j$ in terms 
of the $M_j$ instead: 

$$\begin{aligned}
cm_1 &= cM_1 \\
cm_2 &= cM_2 - c^2 M_1^2 \\
cm_3 &= cM_3 - 3c^2 M_1 M_2 + 2c^3 M_1^3 \\
cm_4 &= cM_4 - 4c^2 M_1 M_3 - 2c^2 M_2^2 + 10c^3 M_1^2 M_2 - 5c^4 M_1^4.
\end{aligned} \qquad (23)$$



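Note also that the moments of $\eta \boxplus \delta_{\sigma^2}$ are 

$$\begin{aligned}
& m_1 + \sigma^2 \\
& m_2 + 2\sigma^2 m_1 + \sigma^4 \\
& m_3 + 3\sigma^2 m_2 + 3\sigma^4 m_1 + \sigma^6 \\
& m_4 + 4\sigma^2 m_3 + 6\sigma^4 m_2 + 4\sigma^6 m_1 + \sigma^8.
\end{aligned} \qquad (24)$$

By the definition of the free probability based estimator, 

$$\mu_{\frac{1}{m}\hat{H}_{1 \ldots L}\hat{H}_{1 \ldots L}^H} = \left( (\eta \boxslash \mu_c) \boxplus \delta_{\sigma^2} \right) \boxtimes \mu_c,$$

where the moments of $\eta$ are $h_1, h_2, h_3, \ldots$. Denoting by 
$\eta_1 = \eta \boxslash \mu_c$, $\eta_2 = \eta_1 \boxplus \delta_{\sigma^2}$, we have that 
$\mu_{\frac{1}{m}\hat{H}_{1 \ldots L}\hat{H}_{1 \ldots L}^H} = \eta_2 \boxtimes \mu_c$. 
Denote also the moments of $\eta_1$ by $r_i$, the moments of $\eta_2$ by $s_i$, 
and as before the moments of $\frac{1}{m}\hat{H}_{1 \ldots L}\hat{H}_{1 \ldots L}^H$ 
by $\hat{h}_1, \hat{h}_2, \hat{h}_3, \ldots$. Write also $c = \frac{n}{mL}$ as in 
proposition 2. For the third moment, we can apply (22), (24) and (23) 
in that order, 

$$\begin{aligned}
\hat{h}_3 &= s_3 + 3c s_1 s_2 + c^2 s_1^3 \\
&= r_3 + 3\sigma^2 r_2 + 3\sigma^4 r_1 + \sigma^6 \\
&\quad + 3c(r_1 + \sigma^2)(r_2 + 2\sigma^2 r_1 + \sigma^4) \\
&\quad + c^2 (r_1 + \sigma^2)^3 \\
&= r_3 + 3c r_1 r_2 + c^2 r_1^3 \\
&\quad + 3\sigma^2(1 + c) r_2 + (6c + 3c^2)\sigma^2 r_1^2 \\
&\quad + \sigma^4(3 + 9c + 3c^2) r_1 + \sigma^6(1 + 3c + c^2) \\
&= h_3 - 3c h_1 h_2 + 2c^2 h_1^3 \\
&\quad + 3c h_1 (h_2 - c h_1^2) + c^2 h_1^3 \\
&\quad + (6c + 3c^2)\sigma^2 h_1^2 + 3\sigma^2(1 + c)(h_2 - c h_1^2) \\
&\quad + \sigma^4(3 + 9c + 3c^2) h_1 + \sigma^6(1 + 3c + c^2) \\
&= h_3 + 3\sigma^2(1 + c) h_2 + 3\sigma^2 c h_1^2 \\
&\quad + 3\sigma^4(c^2 + 3c + 1) h_1 + \sigma^6(c^2 + 3c + 1),
\end{aligned}$$

which is the third equation in (11) of proposition 2. Calculations 
are similar for the other moments, but more tedious for 
the fourth moment. 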
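
The chain of substitutions used in this appendix can be checked mechanically. The sketch below, which is only an illustration with function names of our own choosing, implements the moment relations (22), (23) and (24) with exact rational arithmetic and confirms that composing (23), (24) and (22) reproduces the closed forms for the first three moments. The input moments and the values of c and $\sigma^2$ are arbitrary test values.

```python
from fractions import Fraction as F

def conv_mp(m, c):
    """Moments of eta boxtimes mu_c from the moments m of eta, as in (22)."""
    m1, m2, m3, m4 = m
    return (m1,
            m2 + c * m1**2,
            m3 + 3*c*m1*m2 + c**2 * m1**3,
            m4 + 4*c*m1*m3 + 2*c*m2**2 + 6*c**2 * m1**2 * m2 + c**3 * m1**4)

def deconv_mp(M, c):
    """Inverse relations, as in (23)."""
    M1, M2, M3, M4 = M
    return (M1,
            M2 - c * M1**2,
            M3 - 3*c*M1*M2 + 2*c**2 * M1**3,
            M4 - 4*c*M1*M3 - 2*c*M2**2 + 10*c**2 * M1**2 * M2 - 5*c**3 * M1**4)

def shift(m, s2):
    """Moments of eta boxplus delta_{sigma^2}, as in (24); s2 = sigma^2."""
    m1, m2, m3, m4 = m
    return (m1 + s2,
            m2 + 2*s2*m1 + s2**2,
            m3 + 3*s2*m2 + 3*s2**2 * m1 + s2**3,
            m4 + 4*s2*m3 + 6*s2**2 * m2 + 4*s2**3 * m1 + s2**4)

def observed_from_channel(h, c, s2):
    """Apply (23), then (24), then (22), as in the proof."""
    return conv_mp(shift(deconv_mp(h, c), s2), c)

# Arbitrary exact test values (not taken from the paper).
h = (F(1), F(3, 2), F(5, 2), F(9, 2))
c, s2 = F(1, 2), F(1, 10)
obs = observed_from_channel(h, c, s2)

h1, h2, h3, _ = h
q = c**2 + 3*c + 1
closed_form_3 = h3 + 3*s2*(1 + c)*h2 + 3*s2*c*h1**2 + 3*s2**2 * q * h1 + s2**3 * q
print(obs[2], closed_form_3)
```

Using `Fraction` avoids rounding, so the comparison is an exact identity check rather than a numerical approximation.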



Appendix B 
The proof of proposition 1 

In all the following, the matrices are of dimension n x N. We 
need some terminology and results from [22] for the proof of 
proposition 1. Let $S_p$ be the set of permutations of p elements 
{1, 2, ..., p}. For $\pi \in S_p$, let also $\hat{\pi}$ be the permutation 
in $S_{2p}$ defined by 

$$\begin{aligned}
\hat{\pi}(2j - 1) &= 2\pi^{-1}(j), & (j \in \{1, 2, \ldots, p\}) \\
\hat{\pi}(2j) &= 2\pi(j) - 1, & (j \in \{1, 2, \ldots, p\}).
\end{aligned} \qquad (25)$$

Let $\sim_{\hat{\pi}}$ denote the equivalence relation on {1, ..., 2p} 
generated by the expression 

$$j \sim_{\hat{\pi}} \hat{\pi}(j) + 1, \quad \text{(addition formed mod. } 2p\text{)}, \qquad (26)$$

and let $k(\hat{\pi})$ and $l(\hat{\pi})$ denote the number of equivalence 
classes of $\sim_{\hat{\pi}}$ consisting of even numbers or odd numbers, 
respectively. Corollary 1.12 in [22] (slightly rewritten) states 
that 

$$E\left[ \mathrm{tr}_n \left( \left( \frac{1}{N} XX^H \right)^p \right) \right] = \frac{1}{nN^p} \sum_{\pi \in S_p} N^{k(\hat{\pi})} n^{l(\hat{\pi})}. \qquad (27)$$

(9) can thus be proved by calculating all values of $k(\hat{\pi})$ and 
$l(\hat{\pi})$ for $\pi$ in $S_1$, $S_2$, $S_3$ and $S_4$. We prove here the 
case p = 3, to get an idea on how the calculations are performed. 
For the six permutations in $S_3$ we obtain the following numbers by 
using (25) and (26): 

    $\pi$       Equivalence classes of $\sim_{\hat{\pi}}$   $k(\hat{\pi})$   $l(\hat{\pi})$
    (1,2,3)     {{1,3,5},{2},{4},{6}}                       3                1
    (1,3,2)     {{1,3},{2},{4,6},{5}}                       2                2
    (2,1,3)     {{1,5},{2,4},{3},{6}}                       2                2
    (2,3,1)     {{1},{2,4,6},{3},{5}}                       1                3
    (3,1,2)     {{1,3,5},{2,4,6}}                           1                1
    (3,2,1)     {{1},{2,6},{3,5},{4}}                       2                2

Here $\pi = (i, j, k)$ means that $\pi(1) = i$, $\pi(2) = j$, $\pi(3) = k$. 
Putting the numbers into (27) we get 

$$\begin{aligned}
E\left[ \mathrm{tr}_n \left( \left( \frac{1}{N} XX^H \right)^3 \right) \right]
&= \frac{1}{nN^3} \left( N^3 n + N^2 n^2 + N^2 n^2 + N n^3 + N n + N^2 n^2 \right) \\
&= 1 + 3\frac{n}{N} + \frac{n^2}{N^2} + \frac{1}{N^2} = 1 + 3c + c^2 + \frac{1}{N^2},
\end{aligned}$$

which is the third equation in (9). We skip the computations 
for the other equations in (9), since they are very similar and 
quite tedious, since $S_p$ has p! elements. 
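
The enumeration over $S_p$ is tedious by hand but easy to automate. The following Python sketch, with function names of our own choosing, builds $\hat{\pi}$ from (25), counts the even and odd classes of the relation (26) with a small union-find, and evaluates the sum (27) exactly. It is an illustration of the counting argument, not code from the paper.

```python
from fractions import Fraction
from itertools import permutations

def pi_hat(pi):
    """Permutation of {1..2p} defined in (25); pi[j-1] holds pi(j)."""
    p = len(pi)
    inv = [0] * (p + 1)
    for j in range(1, p + 1):
        inv[pi[j - 1]] = j
    hat = {}
    for j in range(1, p + 1):
        hat[2 * j - 1] = 2 * inv[j]
        hat[2 * j] = 2 * pi[j - 1] - 1
    return hat

def class_counts(pi):
    """Return (number of even classes, number of odd classes) of (26)."""
    p = len(pi)
    hat = pi_hat(pi)
    parent = list(range(2 * p + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for j in range(1, 2 * p + 1):
        target = hat[j] % (2 * p) + 1  # hat(j) + 1, addition mod 2p
        parent[find(j)] = find(target)
    roots = {find(j) for j in range(1, 2 * p + 1)}
    # Every class has a single parity, so the root's parity identifies it.
    num_even = sum(1 for r in roots if r % 2 == 0)
    return num_even, len(roots) - num_even

def wishart_moment(p, n, N):
    """E[tr_n((XX^H / N)^p)] evaluated through (27), as an exact fraction."""
    total = 0
    for pi in permutations(range(1, p + 1)):
        k, l = class_counts(pi)
        total += N ** k * n ** l
    return Fraction(total, n * N ** p)

print(class_counts((2, 3, 1)), wishart_moment(3, 2, 5))
```

For p = 3 this reproduces the table above, and for p = 4 it gives the remaining moment without listing the 24 permutations by hand.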

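Appendix C 
The proof of theorems 2 and 3 

We will first show the following: 

Lemma 1: For systems of type 
$W_n = \frac{1}{N}(R_n + \sigma X_n)(R_n + \sigma X_n)^H$, the following holds 
when $R_n$ is deterministic: 

$$\begin{aligned}
E[\mathrm{tr}_n(W_n)] &= m_1 + \sigma^2 \\
E[\mathrm{tr}_n(W_n^2)] &= m_2 + 2\sigma^2(1 + c)m_1 + \sigma^4(1 + c) \\
E[\mathrm{tr}_n(W_n^3)] &= m_3 + 3\sigma^2(1 + c)m_2 + 3\sigma^2 c m_1^2 \\
&\quad + 3\sigma^4\left(c^2 + 3c + 1 + \frac{1}{N^2}\right)m_1 \\
&\quad + \sigma^6\left(c^2 + 3c + 1 + \frac{1}{N^2}\right) \\
E[\mathrm{tr}_n(W_n^4)] &= m_4 + 4\sigma^2(1 + c)m_3 + 8\sigma^2 c m_2 m_1 \\
&\quad + \sigma^4\left(6c^2 + 16c + 6 + \frac{16}{N^2}\right)m_2 + 14\sigma^4 c(1 + c)m_1^2 \\
&\quad + 4\sigma^6\left(c^3 + 6c^2 + 6c + 1 + \frac{5(c+1)}{N^2}\right)m_1 \\
&\quad + \sigma^8\left(c^3 + 6c^2 + 6c + 1 + \frac{5(c+1)}{N^2}\right),
\end{aligned} \qquad (28)$$

where $m_j = \mathrm{tr}_n\left(\left(\frac{1}{N}R_nR_n^H\right)^j\right)$. 

We remark that it is the assumption that $X_n$ is Gaussian 
which makes the mixed moments $E[\mathrm{tr}_n(W_n^k)]$ expressible 
in terms of the individual moments $m_j$. Without the Gaussian 
assumption, there is no reason why such a relationship should 
hold. Also, while our statements are made only for the four 
first moments, we remark that similar relationships can be 
written down for higher moments also, which deviate from 
corresponding free probability based estimates only in terms 
of the form $\frac{1}{N^{2k}}$ (that the deviation terms are on this form is 
actually a consequence of theorem 1.13 of [22]). 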



Before we prove lemma 1, let us explain how it proves 
theorems 2 and 3. We substitute mL for N (i.e. $c = \frac{n}{mL}$) for 
the case of L observations, m for N (i.e. $c = \frac{n}{m}$) for the case 
of one observation, and the correspondingly stacked channel 
matrix for $R_n$ in lemma 1. Since the 
first two equations in (28) coincide with the corresponding first 
two formulas in (11) and (15), we see that the free probability 
based and the Gaussian matrix mean based estimators coincide 
for the first two moments in the case of only one observation, 
and that they are both unbiased for these two moments 
(regardless of which model is used). This proves the third 
statement of theorem 3 and the statements on the first two 
moments in theorem 2. 

The third and fourth formulas in (15) are seen to equal 
the third and fourth formulas in (28), which explains why the 
Gaussian matrix mean based estimator has no bias in the third 
and fourth moments, thereby proving the first statement of 
theorem 3 (model (5) is also addressed due to the relationship 
(12)). The bias in the free probability based estimator is easily 
found by noting that the only differences between the third 
formula in (11) and the third formula in (28) are the terms 
$\frac{3\sigma^4}{m^2L^2} m_1$ and $\frac{\sigma^6}{m^2L^2}$. This proves 
statement 2 in theorem 2. 

To see that Cg is asymptotically unbiased when $m \to \infty$ 
(with n, L kept fixed), it is sufficient to prove that the variance 
of all moments $\mathrm{tr}_n(W_n^k)$ go to zero. This will remedy the fact 
that the capacity is a non-linear expression of the moments. 
The proof for this part is a bit sketchy, since a similar analysis 
of such variances has already been done more thoroughly in 
connection with the theory of second order freeness [23]. We 
need to analyse 

$$E\left[ \left( \mathrm{tr}_n(W_n^k) \right)^2 \right] - \left( E\left[ \mathrm{tr}_n(W_n^k) \right] \right)^2. \qquad (29)$$

This analysis is very similar to the one in the proof of lemma 1 
below: One simply associates each term in $\mathrm{tr}_n(W_n^k)$ with a circle 
with 2k edges, and identifies the edges which correspond to 
equal Gaussian elements (this corresponds to the equivalence 
relation $\sim_{\hat{\pi}}$ of appendix B). Computation of 
$E[(\mathrm{tr}_n(W_n^k))^2]$ and $(E[\mathrm{tr}_n(W_n^k)])^2$ is thus reduced to 
counting the number of terms which give rise to the different 
identifications of the edges on two circles (one circle for each 
trace). We need only consider identifications which are pairings, 
due to the statements in appendix B when the matrix entries are 
Gaussian (see also [18], [22]). 

One sees immediately that the edge identifications which 
can be found in $(E[\mathrm{tr}_n(W_n^k)])^2$ are a subset of the edge 
identifications which can be found in $E[(\mathrm{tr}_n(W_n^k))^2]$. These 
edge identifications therefore cancel each other in the 
expression for the variance, and we may therefore restrict to 
edge identifications which only appear in $E[(\mathrm{tr}_n(W_n^k))^2]$. 
These correspond to the edge identifications where at least one 
identification across the two circles takes place. If we perform 
one such edge identification first, we are left with one circle 
with 4k - 2 edges (when the two identified edges are skipped). 
After the identification of the remaining edges, the vertices can 
be associated with a choice among the elements {1, ..., N}, or 
a choice among the elements {1, ..., n} (matching with matrix 
dimensions). Similarly as in appendix B, let $k(\pi)$ denote the 
number of vertices of the first type, $l(\pi)$ the number of vertices 
of the second type. It is clear that $k(\pi) \le 2k - 1$ after the 
identification of edges. Since $N^{k(\pi)} \le N^{2k-1}$ is not enough 
to cancel the leading $N^{2k}$-factor in $E[(\mathrm{tr}_n(W_n^k))^2]$ (recall 
that only N goes to infinity, not n), we conclude that (29) is 
$O\left(\frac{1}{N}\right)$, so that the variance of all moments go to 0 as 
claimed, and we have established the second statement of theorem 3. 

Cf is, following the same reasoning, asymptotically 
unbiased when $L \to \infty$ or $m \to \infty$ for model (4), and when 
L = 1 and $m \to \infty$ for model (5). This proves the two 
second statements in theorem 2, which concludes the proof of 
theorems 2 and 3. 

Proof of lemma 1: Two facts are important in the proof. 
First of all, if $x_1, \ldots, x_k$ are standard i.i.d. complex Gaussian 
random variables, then, according to remark 2.2 in [24], 

$$E\left( x_1^{i_1} \cdots x_k^{i_k} \, \overline{x_1^{j_1} \cdots x_k^{j_k}} \right) = 0 \quad \text{unless } i_1 = j_1, \ldots, i_k = j_k. \qquad (30)$$

Secondly, $E(|x_i|^{2p}) = p!$ for such $x_1, \ldots, x_k$. We remark 
that the proof presented here can be simplified by using the 
following trick, taken from [22]: Rewrite a complex standard 
Gaussian random variable x to the form $\frac{1}{\sqrt{s}}(x_1 + \cdots + x_s)$, 
where $x_1, \ldots, x_s$ are i.i.d. complex, standard, and Gaussian 
([22] uses this trick, and lets s go to infinity). 
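
In the scalar case n = N = 1 (so c = 1) the two facts above already determine every moment: expanding $|r + \sigma x|^{2p}$ and keeping only the terms allowed by (30) gives $E|r + \sigma x|^{2p} = \sum_k \binom{p}{k}^2 k! \, \sigma^{2k} |r|^{2(p-k)}$. The sketch below uses this as a sanity check of the four formulas of lemma 1 as we read them from (28). It is only a check of a special case under that reading, not a proof, and the function names and test values are ours.

```python
from fractions import Fraction
from math import comb, factorial

def scalar_moment(p, a, s2):
    """E[|r + sigma*x|^(2p)] with a = |r|^2, s2 = sigma^2, x standard complex Gaussian."""
    return sum(comb(p, k) ** 2 * factorial(k) * s2 ** k * a ** (p - k)
               for k in range(p + 1))

def lemma1_rhs(m, s2, c, N):
    """Right hand sides of the four equations in (28)."""
    m1, m2, m3, m4 = m
    e = Fraction(1, N * N)
    q3 = c**2 + 3*c + 1 + e
    q4 = c**3 + 6*c**2 + 6*c + 1 + 5*(c + 1)*e
    w1 = m1 + s2
    w2 = m2 + 2*s2*(1 + c)*m1 + s2**2 * (1 + c)
    w3 = m3 + 3*s2*(1 + c)*m2 + 3*s2*c*m1**2 + 3*s2**2 * q3 * m1 + s2**3 * q3
    w4 = (m4 + 4*s2*(1 + c)*m3 + 8*s2*c*m2*m1
          + s2**2 * (6*c**2 + 16*c + 6 + 16*e) * m2
          + 14*s2**2 * c*(1 + c)*m1**2
          + 4*s2**3 * q4 * m1 + s2**4 * q4)
    return (w1, w2, w3, w4)

# Arbitrary exact test values: a = |r|^2 and s2 = sigma^2.
a, s2 = Fraction(3, 2), Fraction(1, 10)
m = (a, a**2, a**3, a**4)          # m_j = a^j when n = N = 1
lhs = tuple(scalar_moment(p, a, s2) for p in range(1, 5))
rhs = lemma1_rhs(m, s2, 1, 1)
print(lhs == rhs)
```

Because n = N = 1 merges the different powers of c and of 1/N, this check cannot distinguish every coefficient individually; it only confirms that the totals at each order of $\sigma^2$ are consistent.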

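Set $\Gamma_n = \frac{1}{N}R_nR_n^H$. Let us first look at the case for the 
second moment. Note that 

$$\begin{aligned}
E\left[\mathrm{tr}_n\left(W_n^2\right)\right] &= E\left[\mathrm{tr}_n\left(\Gamma_n^2\right)\right] \\
&\quad + E\left[\mathrm{tr}_n\left(\sigma^2 \tfrac{1}{N^2} R_nR_n^H X_nX_n^H\right)\right] \\
&\quad + E\left[\mathrm{tr}_n\left(\sigma^2 \tfrac{1}{N^2} R_nX_n^H X_nR_n^H\right)\right] \\
&\quad + E\left[\mathrm{tr}_n\left(\sigma^2 \tfrac{1}{N^2} X_nX_n^H R_nR_n^H\right)\right] \\
&\quad + E\left[\mathrm{tr}_n\left(\sigma^2 \tfrac{1}{N^2} X_nR_n^H R_nX_n^H\right)\right] \\
&\quad + E\left[\mathrm{tr}_n\left(\sigma^4 \left(\tfrac{1}{N} X_nX_n^H\right)^2\right)\right] \\
&\quad + E\left[\mathrm{tr}_n(\eta_2)\right],
\end{aligned} \qquad (31)$$

where the terms in $\eta_2$ have expectation zero due to (30). We 
see that 

• the first (deterministic) term is $m_2$, matching the first term 
in the second equation of (28). 

• The next-to-last term is $\sigma^4(1 + c)$, according to the second 
equation in (9). This matches the last term in the second 
equation of (28). 

• By direct computation, the second term is 

$$\sigma^2 \frac{1}{nN^2} \sum_{i,j,k,l} E\left( R_n(i,j) R_n^H(j,k) X_n(k,l) X_n^H(l,i) \right).$$

This is nonzero only for k = i, so that this equals 

$$\sigma^2 \frac{1}{nN^2} N n \, \mathrm{tr}_n\left(R_nR_n^H\right) = \sigma^2 \mathrm{tr}_n\left(\tfrac{1}{N}R_nR_n^H\right) = \sigma^2 m_1.$$

• Similarly for the third term, which equals 

$$\sigma^2 \frac{1}{nN^2} \sum_{i,j,k,l} E\left( R_n(i,j) X_n^H(j,k) X_n(k,l) R_n^H(l,i) \right) = \sigma^2 \frac{1}{nN^2} n \cdot n \, \mathrm{tr}_n\left(R_nR_n^H\right) = \sigma^2 c m_1.$$

• The fourth and fifth term equal the second and third due 
to the trace property, so that the sum of the contributions 
of the second to fifth terms are $2\sigma^2(1 + c)m_1$, which 
matches the second term in the second equation of (28). 

Thus, contributions on the right hand side of (31) add up to 
the right hand side of the second equation in (28), proving the 
case for the second moment. 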
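For the third moment, write 

$$\begin{aligned}
E\left[\mathrm{tr}_n\left(W_n^3\right)\right] &= E\left[\mathrm{tr}_n\left(\Gamma_n^3\right)\right] + \sigma^2 E\left[\mathrm{tr}_n\left(\alpha_{31} + \alpha_{32}\right)\right] \\
&\quad + \sigma^4 E\left[\mathrm{tr}_n\left(\beta_{31} + \beta_{32}\right)\right] + \sigma^6 E\left[\mathrm{tr}_n\left(\left(\tfrac{1}{N}X_nX_n^H\right)^3\right)\right] \\
&\quad + E\left[\mathrm{tr}_n(\eta_3)\right],
\end{aligned} \qquad (32)$$

where the terms in $\eta_3$ all have expectation zero, and 

$$\begin{aligned}
\alpha_{31} &= \tfrac{1}{N^3}\big( X_nX_n^H R_nR_n^H R_nR_n^H + R_nX_n^H X_nR_n^H R_nR_n^H \\
&\qquad + R_nR_n^H X_nX_n^H R_nR_n^H + R_nR_n^H R_nX_n^H X_nR_n^H \\
&\qquad + R_nR_n^H R_nR_n^H X_nX_n^H + X_nR_n^H R_nR_n^H R_nX_n^H \big), \\
\alpha_{32} &= \tfrac{1}{N^3}\big( X_nR_n^H R_nX_n^H R_nR_n^H + R_nX_n^H R_nR_n^H X_nR_n^H \\
&\qquad + R_nR_n^H X_nR_n^H R_nX_n^H \big), \\
\beta_{31} &= \tfrac{1}{N^3}\big( R_nR_n^H X_nX_n^H X_nX_n^H + X_nR_n^H R_nX_n^H X_nX_n^H \\
&\qquad + X_nX_n^H R_nR_n^H X_nX_n^H + X_nX_n^H X_nR_n^H R_nX_n^H \\
&\qquad + X_nX_n^H X_nX_n^H R_nR_n^H + R_nX_n^H X_nX_n^H X_nR_n^H \big), \\
\beta_{32} &= \tfrac{1}{N^3}\big( R_nX_n^H X_nR_n^H X_nX_n^H + X_nR_n^H X_nX_n^H R_nX_n^H \\
&\qquad + X_nX_n^H R_nX_n^H X_nR_n^H \big)
\end{aligned}$$

(i.e. the terms in $\alpha_{31}$, $\beta_{31}$ have the terms $X_n$, $X_n^H$ 
adjacent to each other). We see that 

• the first and fourth terms in (32) match the first and fifth 
terms on the right hand side of the third equation in (28) 
(due to (9)). 

• Three of the terms in $\alpha_{31}$ are seen to contribute with 

$$\frac{1}{nN^3} N n \, \mathrm{tr}_n\left(\left(R_nR_n^H\right)^2\right) = m_2,$$

and the remaining three terms are seen to contribute 

$$\frac{1}{nN^3} n \cdot n \, \mathrm{tr}_n\left(\left(R_nR_n^H\right)^2\right) = c m_2.$$

Addition gives $\alpha_{31} = 3(1 + c)m_2$. 

• All terms in $\alpha_{32}$ are seen to contribute 

$$\frac{1}{nN^3} n^2 \left(\mathrm{tr}_n\left(R_nR_n^H\right)\right)^2 = c m_1^2,$$

so that the total contribution is $3c m_1^2$. 

• Using the second formula in (9), three terms in $\beta_{31}$ are 
seen to contribute 

$$\frac{1}{N^3} \mathrm{tr}_n\left(R_nR_n^H\right) N^2 (1 + c) = (1 + c)m_1,$$

and the remaining three terms contribute 

$$\frac{1}{N^3} \mathrm{tr}_n\left(R_nR_n^H\right) nN (1 + c) = c(1 + c)m_1.$$

Addition gives $3(c^2 + 2c + 1)m_1$. 

• All terms in $\beta_{32}$ are seen to contribute 

$$\frac{1}{nN^3} n \, \mathrm{tr}_n\left(R_nR_n^H\right)(nN - 1) + \frac{1}{nN^3} n \, \mathrm{tr}_n\left(R_nR_n^H\right) \times 2,$$

where the factor 2 comes in since $E(|x|^4) = 2$ for a 
complex standard Gaussian random variable. Simplifying 
we get $\left(c + \frac{1}{N^2}\right)m_1$, so that the total contribution is 
$3\left(c + \frac{1}{N^2}\right)m_1$. 

Thus, contributions on the right hand side of (32) add up to 
the right hand side of the third equation in (28), proving the 
case for the third moment also. 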

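Now for the fourth equation in (28). The details in this are 
similar to the calculations for the third moment, but much 
more tedious. The first term for the fourth moment formula in 
(28) is trivial, as is the last term which comes from the fourth 
formula in (9). The second and third terms are calculated 
using exactly the same strategy as for the third moment. The 
remaining fourth, fifth and sixth terms require much attention. 
We address just some of these. 

Computing $E[\mathrm{tr}_n(\sigma^6(\beta_{41} + \beta_{42}))]$ gives the sixth term, 
where the terms in $\beta_{41}$ are similar to those for $\beta_{31}$ (i.e. the 
terms $X_n$, $X_n^H$ are adjacent to each other), i.e. four terms have 
the same trace as 

$$a = \tfrac{1}{N^4} R_nR_n^H X_nX_n^H X_nX_n^H X_nX_n^H,$$

while four terms have the same trace as 

$$b = \tfrac{1}{N^4} X_nR_n^H R_nX_n^H X_nX_n^H X_nX_n^H.$$

It is clear that $E[\mathrm{tr}_n(a)]$ equals 

$$\frac{1}{N^4} \mathrm{tr}_n\left(R_nR_n^H\right) E\left[\mathrm{tr}_n\left(\left(X_nX_n^H\right)^3\right)\right]
= \mathrm{tr}_n\left(\tfrac{1}{N}R_nR_n^H\right) E\left[\mathrm{tr}_n\left(\left(\tfrac{1}{N}X_nX_n^H\right)^3\right)\right]
= \left(c^2 + 3c + 1 + \frac{1}{N^2}\right)m_1,$$

and that $E[\mathrm{tr}_n(b)]$ equals 

$$\frac{1}{N^4} \mathrm{tr}_n\left(R_nR_n^H\right) c \, E\left[\mathrm{tr}_n\left(\left(X_nX_n^H\right)^3\right)\right]
= c \, \mathrm{tr}_n\left(\tfrac{1}{N}R_nR_n^H\right) E\left[\mathrm{tr}_n\left(\left(\tfrac{1}{N}X_nX_n^H\right)^3\right)\right]
= c\left(c^2 + 3c + 1 + \frac{1}{N^2}\right)m_1,$$

so that $\beta_{41} = 4(1 + c)\left(c^2 + 3c + 1 + \frac{1}{N^2}\right)m_1$. 

Similarly, for $\beta_{42}$ (where the terms $X_n$, $X_n^H$ are not adjacent 
to each other), we need to address four terms which all have 
the same trace as 

$$c = \tfrac{1}{N^4} R_nX_n^H X_nR_n^H X_nX_n^H X_nX_n^H,$$

and four terms which have the same trace as 

$$d = \tfrac{1}{N^4} X_nR_n^H X_nX_n^H R_nX_n^H X_nX_n^H.$$

By counting terms carefully, we see that these eight terms 
together contribute with $\left(8c + 8c^2 + \frac{16(c+1)}{N^2}\right)m_1$ (during this 
count of terms, we need the fact that $E(|x|^6) = 6$ when x is 
complex, standard, and Gaussian). All in all we have that 

$$\begin{aligned}
E\left[\mathrm{tr}_n\left(\sigma^6(\beta_{41} + \beta_{42})\right)\right]
&= 4\sigma^6(1 + c)\left(c^2 + 3c + 1 + \frac{1}{N^2}\right)m_1 \\
&\quad + \sigma^6\left(8c + 8c^2 + \frac{16(c+1)}{N^2}\right)m_1 \\
&= 4\sigma^6\left(c^3 + 6c^2 + 6c + 1 + \frac{5(1+c)}{N^2}\right)m_1,
\end{aligned}$$

which is the sixth term in the fourth equation of (28). 

The details for the fourth and fifth terms are dropped. ∎ 

As can be seen, the requirement that $R_n$ is deterministic 
is not strictly necessary in the proof of lemma 1, so that we 
could replace it with any random matrix independent from 
$X_n$, the moment $m_j$ with $E\left[\mathrm{tr}_n\left(\left(\tfrac{1}{N}R_nR_n^H\right)^j\right)\right]$, 
and the products $m_i m_j$ with 
$E\left[\mathrm{tr}_n\left(\left(\tfrac{1}{N}R_nR_n^H\right)^i\right) \mathrm{tr}_n\left(\left(\tfrac{1}{N}R_nR_n^H\right)^j\right)\right]$. 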


References 

[1] E. Telatar, "Capacity of multi-antenna gaussian channels," Eur. Trans. 
Telecomm. ETT, vol. 10, no. 6, pp. 585-596, Nov. 1999. 

[2] T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, "Random matrix 
theories in quantum physics: Common concepts," Physica Rep., pp. 190- 
299, 1998. 

[3] J.-P. Bouchaud and M. Potters, Theory of Financial Risk and Derivative 
Pricing - From Statistical Physics to Risk Management. Cambridge: 
Cambridge University Press, 2000. 

[4] B. Dozier and J. W. Silverstein, "On the empirical distribution of 
eigenvalues of large dimensional information-plus-noise type matrices," 
J. Multivariate Anal., vol. 98, no. 4, pp. 678-694, 2007. 

[5] Ø. Ryan and M. Debbah, "Multiplicative free convolution 
and information-plus-noise type matrices," 2007, 
http://arxiv.org/abs/math.PR/0702342. 

[6] Ø. Ryan and M. Debbah, "Free deconvolution for signal processing 
applications," Submitted to IEEE Trans. on Information Theory, 2007, 
http://arxiv.org/abs/cs.IT/0701025. 

[7] R. L. de Lacerda Neto, L. Sampaio, R. Knopp, M. Debbah, and 
D. Gesbert, "EMOS platform: Real-time capacity estimation of MIMO 
channels in the UMTS-TDD band," in International Symposium on 
Wireless Communication Systems, Trondheim, Norway, October 2007. 

[8] R. L. de Lacerda Neto, L. Sampaio, H. Hoffsteter, M. Debbah, 
D. Gesbert, and R. Knopp, "Capacity of MIMO systems: Impact of 
polarization, mobility and environment," in IRAMUS Workshop, 
Val Thorens, France, January 2007. 

[9] J. P. Kermoal, L. Schumacher, K. I. Pedersen, P. E. Mogensen, and 
F. Frederiksen, "A stochastic MIMO radio channel model with 
experimental validation," IEEE Journal on Selected Areas in 
Communications, vol. 20, no. 6, pp. 1211-1225, 2002. 

[10] T. Pollet, M. V. Bladel, and M. Moeneclaey, "BER sensitivity of OFDM 
systems to carrier frequency offset and wiener phase noise," IEEE Trans. 
on Communications, vol. 43, pp. 191-193, 1995. 

[11] P. H. Moose, "A technique for orthogonal frequency division 
multiplexing frequency offset correction," IEEE Trans. on 
Communications, vol. 42, no. 10, pp. 2908-2914, 1994. 

[12] A. F. Molisch, M. Steinbauer, M. Toeltsch, E. Bonek, and R. Thoma, 
"Measurement of the capacity of MIMO systems in frequency-selective 
channels," in IEEE 53rd Vehicular Technology Conference (VTC 2001 
Spring), 2001, pp. 204-208. 

[13] E. Bonek, M. Steinbauer, H. Hofstetter, and C. F. Mecklenbrauker, 
"Double-directional radio channel measurements - what we can derive 
from them," in URSI International Symposium on Signals, Systems, and 
Electronics (ISSSE'01), 2001, pp. 89-92. 

[14] H. Ozcelik, M. Herdin, H. Hofstetter, and E. Bonek, "A comparison of 
measured 8x8 MIMO systems with a popular stochastic channel model 
at 5.2ghz," in 10th International Conference on Telecommunications 
(ICT'2003), 2003, pp. 1542-1546. 

[15] E. Bonek, N. Czink, V. Holappa, M. Alatossava, L. Hentila, J. Nuutinen, 
and A. Pal, "Indoor MIMO measurements at 2.55 and 5.25 ghz - a 
comparison of temporal and angular characteristics," in Proceedings of 
the IST Mobile Summit 2006, 2006. 






[16] F. Hiai and D. Petz, The Semicircle Law, Free Random Variables and 
Entropy. American Mathematical Society, 2000. 

[17] A. M. Tulino and S. Verdu, Random Matrix Theory and Wireless 
Communications. www.nowpublishers.com, 2004. 

[18] A. Nica and R. Speicher, Lectures on the Combinatorics of Free 
Probability. Cambridge University Press, 2006. 

[19] R. Seroul and D. O'Shea, Programming for Mathematicians. Springer, 
2000. 

[20] Ø. Ryan, Tools for estimating channel capacity, 2007, 
http://ifi.uio.no/~oyvindry/channelcapacity/. 

[21] H. Bercovici and D. V. Voiculescu, "Free convolution of measures with 
unbounded support," Indiana Univ. Math. J., vol. 42, no. 3, pp. 733-774, 
1993. 

[22] U. Haagerup and S. Thorbjørnsen, "Random matrices 
and K-theory for exact C*-algebras." [Online]. Available: 
citeseer.ist.psu.edu/114210.html 

[23] B. Collins, J. A. Mingo, P. Sniady, and R. Speicher, "Second order 
freeness and fluctuations of random matrices: III. Higher order freeness 
and free cumulants," Documenta Math., vol. 12, pp. 1-70, 2007. 

[24] S. Thorbjørnsen, "Mixed moments of Voiculescu's Gaussian random 
matrices," J. Funct. Anal., vol. 176, no. 2, pp. 213-246, 2000. 



